Cells in a developing embryo have no direct way of "measuring" their physical position. Through 
a variety of processes, however, the expression levels of multiple genes come to be correlated with 
position, and these expression levels thus form a code for "positional information." We show how 
to measure this information, in bits, using the gap genes in the Drosophila embryo as an example. 
Individual genes carry nearly two bits of information, twice as much as expected if the expression 
patterns consisted only of on/off domains separated by sharp boundaries. Taken together, four 
gap genes carry enough information to define a cell's location with an error bar of ~ 1% along the 
anterior-posterior axis of the embryo. This precision is nearly enough for each cell to have a unique 
identity, which is the maximum information the system can use, and is nearly constant along the 
length of the embryo. We argue that this constancy is a signature of optimality in the transmission 
of information from primary morphogen inputs to the output of the gap gene network. 



I. INTRODUCTION 

Building a complex, differentiated body requires that 
individual cells in the embryo make decisions, and ulti- 
mately adopt fates, that are appropriate to their position. 
There are wildly diverging models for how cells acquire 
this "positional information" [1], but there is a general 
consensus that they encode positional information in the 
expression levels of various key genes. A classic exam- 
ple is provided by anterior-posterior patterning in the 
fruit fly, Drosophila melanogaster, where a small set of 
gap genes, and then a larger set of pair rule and segment 
polarity genes, are involved in the specification of the 
body plan [2]. These genes have expression levels which 
vary systematically along the body axis, forming an ap- 
proximate blueprint for the segmented body of the fully 
developed larva that we can "read" within hours after 
the start of development [3]. 

Although there is consensus that particular genes carry 
positional information, much less is known quantitatively 
about how much information is being represented. Do 
the relatively broad, smooth expression profiles of the gap 
genes, for example, provide enough information to spec- 
ify the exact pattern of development, cell by cell, along 
the anterior-posterior axis? How much information does 
the whole embryo actually use in making this pattern? 
Answering these questions is important, in part, because 
we know that crucial molecules involved in the regula- 
tion of gene expression are present at low concentrations 
and even low absolute copy numbers, so that expression is 
noisy [4-11], and this noise must limit the transmission of 
information [12-14]. Is it possible, as suggested theoretically 
[15-18], that the information transmitted through 
these regulatory networks is close to the physical limits 
set by the bounded concentrations of the different tran- 
scription factors? To answer this and other questions, we 
need to measure positional information quantitatively, in 
bits. We do this here using the gap genes in Drosophila 
as an example. 



II. QUANTIFYING INFORMATION 

Before we observe the expression levels of the relevant 
genes, we have no information about the position of the 
cell — it could be anywhere along the anterior-posterior 
axis of the embryo. Mathematically this is equivalent to 
saying that, a priori, the position of the cell is drawn 
from a distribution of possibilities $P_x(x)$; in the simplest 
case this probability distribution is uniform, but it also 
is possible that cells vary in density along the embryo's 
axis. Once we observe the expression level g, we still 
don't know the precise position x of the cell, but our 
uncertainty is greatly reduced. In Fig. 1 we illustrate 
this idea using the gap gene hunchback (hb). Expression 
levels of hb are known to vary systematically along the 
anterior-posterior axis of the Drosophila embryo, but we 
also know that expression levels can be variable across 
cells in the same position, both within a single embryo 
and across multiple embryos. Thus, if we make a "slice" 
through the expression profile at some particular level g, 
we can't point uniquely to the position x of the nucleus 
in which the Hunchback protein has that exact concen- 
tration. Instead there is a range of positions which are 
consistent with the value of g, and we can summarize 
this range of possibilities by the conditional probability 
distribution, $P(x|g)$, that a cell with expression level g 
will be found at position x. For all values of g that occur 
in the embryo, we see that this conditional distribution is 
narrower or more concentrated than the nearly uniform 
distribution $P_x(x)$. 

The probability distributions $P_x(x)$ and $P(x|g)$ provide 
the ingredients we need in order to make a mathematically 
precise version of the qualitative statement 
that "the expression level g of a gene provides information 
about the position x of the cell." Crucially, the 
foundational result of Shannon's information theory is 
that there is only one way of doing this that is consistent 
with simple and plausible requirements, for example 
that independent signals should give additive information 
[19, 20]. 

FIG. 1. Positional information carried by the expression 
of Hunchback. Upper left panel: optical section through the 
midsagittal plane of a Drosophila embryo with immunofluorescence 
staining against Hb protein; scale bar is 100 μm. Lower 
panel left: normalized dorsal profiles of fluorescence intensity, 
which we identify as Hb expression level g, from 24 embryos 
(light red dots) selected in a 30 to 40 min time interval after 
the beginning of nuclear cycle 14 (see Methods). Means 
and standard deviations are plotted in darker red. Considering 
all points with g = 0.1, 0.5, or 0.9 yields the conditional 
distributions $P(x|g)$ shown at right. Note that these distributions 
are much more sharply concentrated than the uniform 
distribution $P_x(x)$, shown in light grey; correspondingly the 
entropies $S_f = S[P(x|g)]$ are very much smaller than the entropy 
$S_i = S[P_x(x)]$. For each g, we note the reduction of 
uncertainty in x by reading out g, $\Delta S = S_i - S_f$. Upper right 
corner: variations in expression level around the mean at each 
position, estimated by the distribution of normalized relative 
expression, given by $\Delta = [g - \bar{g}(x)]/\sigma_g(x)$ (red circles with 
standard errors of the mean). Solid line is a zero mean/unit 
variance Gaussian. Details of staining, imaging, age and orientation 
selection, normalization and entropy estimation are 
given in Methods. 

For any probability distribution we can define an entropy 
S, which is the same quantity that appears in statistical 
mechanics and thermodynamics; for the two distributions 
here we have 

$$ S[P_x(x)] = -\int dx\, P_x(x) \log_2 [P_x(x)] \ \text{bits}, \tag{1} $$
$$ S[P(x|g)] = -\int dx\, P(x|g) \log_2 [P(x|g)] \ \text{bits}. \tag{2} $$

For example, if we measure x from 0 to L along the 
length of the embryo, then a uniform distribution of cells 
corresponds to $P_x(x) = 1/L$, and this has the maximum 
possible entropy $S[P_x(x)]$. The intuition that the 
conditional distribution $P(x|g)$ is narrower or more concentrated 
than $P_x(x)$ is quantified by the fact that the 
entropy $S[P(x|g)]$ is smaller than $S[P_x(x)]$, and this reduction 
in entropy is exactly the information that observing 
g provides about x, here measured in bits. As 
an example, if observing the expression level g tells us, 
with complete certainty, that the cell is located in a 
small region of size $\Delta x$, then the gain in information is 
$I = S[P_x(x)] - S[P(x|g)] = \log_2(L/\Delta x)$ bits. 
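
The closing example can be checked numerically. The following is a minimal sketch, assuming a uniform prior on a grid and a posterior that is uniform inside a window; the grid size and the window width of 5% of the axis are illustrative choices, not values from the measurements.

```python
import math

def entropy_bits(density, dx):
    """Entropy -sum p*log2(p)*dx of a density sampled on a uniform grid."""
    return -sum(p * math.log2(p) * dx for p in density if p > 0)

L = 1.0                      # embryo length, arbitrary units (assumption)
n = 1000                     # number of grid points (assumption)
dx = L / n
width = 0.05 * L             # hypothetical window size Delta x

prior = [1.0 / L] * n                              # uniform P_x(x)
k = round(width / dx)                              # grid points inside the window
posterior = [1.0 / width] * k + [0.0] * (n - k)    # P(x|g), uniform on the window

info_bits = entropy_bits(prior, dx) - entropy_bits(posterior, dx)
print(f"entropy reduction: {info_bits:.3f} bits")
print(f"log2(L / Delta x): {math.log2(L / width):.3f} bits")
```

Halving the window adds one bit, which is the sense in which each bit corresponds to a factor of two in positional resolution.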

If we choose a cell at random, we will see an expression 
level g drawn from the distribution $P_g(g)$. The average 
information that this expression level provides about position 
is then 

$$ I_{g \to x} = \int dg\, P_g(g) \left( S[P_x(x)] - S[P(x|g)] \right) \tag{3} $$
$$ = \int dg \int dx\, P(g,x) \log_2 \left[ \frac{P(g,x)}{P_g(g) P_x(x)} \right] \ \text{bits}, \tag{4} $$

where $P(g,x)$ is the joint probability of observing a cell 
at x with expression level g, and we have rearranged the 
terms to emphasize the symmetry — information which 
the expression level provides about the position of the 
cell is, on average, the same as the information that the 
position of the cell provides about the expression level. 
This average information is called the mutual information 
between g and x. Again we emphasize that this 
measure of information is not one among many equally 
good possibilities; it is unique. 
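
The equivalence of the two forms, and the symmetry between g and x, can be illustrated on a small discrete example. This is a sketch only: the 3 x 4 joint table below is invented for illustration (rows are expression bins, columns are position bins) and is not derived from the data.

```python
import math

def entropy_bits(p):
    """Shannon entropy of a discrete distribution, in bits."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information_bits(joint):
    """Eq. (4) for a discrete table joint[i][j] = P(g_i, x_j)."""
    p_g = [sum(row) for row in joint]
    p_x = [sum(col) for col in zip(*joint)]
    return sum(p * math.log2(p / (p_g[i] * p_x[j]))
               for i, row in enumerate(joint)
               for j, p in enumerate(row) if p > 0)

def info_from_entropy_drop(joint):
    """Eq. (3): average over g of S[P_x] - S[P(x|g)]."""
    p_x = [sum(col) for col in zip(*joint)]
    total = 0.0
    for row in joint:
        p_g = sum(row)
        if p_g > 0:
            conditional = [p / p_g for p in row]
            total += p_g * (entropy_bits(p_x) - entropy_bits(conditional))
    return total

# Hypothetical joint distribution; entries sum to 1.
joint = [[0.20, 0.05, 0.00, 0.00],
         [0.05, 0.15, 0.10, 0.00],
         [0.00, 0.05, 0.15, 0.25]]
transposed = [list(col) for col in zip(*joint)]   # swaps the roles of g and x

i_direct = mutual_information_bits(joint)
i_drop = info_from_entropy_drop(joint)
i_swapped = mutual_information_bits(transposed)
print(i_direct, i_drop, i_swapped)
```

All three numbers agree to rounding error, which is the discrete counterpart of the statement that the information is mutual.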

Because information is mutual, we can also write $I_{g \to x}$ 
in terms of the distribution of expression levels g that we 
find in cells at a particular position, $P(g|x)$, 

$$ I_{g \to x} = \int dx\, P_x(x) \left( S[P_g(g)] - S[P(g|x)] \right). \tag{5} $$

This emphasizes that the amount of information that can 
be conveyed is limited both by the overall dynamic range 
of expression levels, which determines $S[P_g(g)]$, and by 
the variability or noise in expression levels at a fixed position, 
which is measured by $S[P(g|x)]$. It will be useful 
that the distribution of expression levels at one point, 
$P(g|x)$, is approximately Gaussian, as shown at the upper 
right in Fig. 1. 

In what follows we will use Eq. (5) to make a "direct" 
measurement of information, while Eq. (3) invites us to try 
and "decode" the information carried by the expression 
levels to recover estimates of the position x of each cell. 
Each approach has a natural generalization to the case 
where information is conveyed not by the expression level 
of one gene but by the combined expression levels of multiple 
genes $\{g_i\}$, and we will explore this as well. It is 
important to emphasize that the number of bits of infor- 
mation carried by the gene expression levels has mean- 
ing independent of the mechanisms by which this coding 
is established. Thus, at one extreme, it could be that 
each cell sets its expression levels independently in response 
to some primary morphogen (such as Bicoid in 
the Drosophila embryo [21-23]), while at the other extreme 
the spatial patterns of expression could arise entirely 
from communication between neighboring cells, in 
a Turing-like mechanism [24, 25]. In these different extremes, 
the precise value of the positional information 
places different quantitative constraints on the underly- 
ing mechanisms, but in all cases the number of available 
bits tells us about the reliability and complexity of the 
pattern that can be constructed from the local expression 
levels alone. 



III. INFORMATION CARRIED BY SINGLE 
GAP GENES 



Estimating the mutual information that one gene expression 
level provides about position requires, from Eq. 
(5), that we obtain a good estimate of the conditional distribution 
$P(g|x)$. Using immunofluorescent staining, we 
can measure g vs. x along the anterior-posterior axis of 
single Drosophila embryos, and by making such measurements 
on multiple embryos, as shown in Fig. 1, we obtain 
many samples of the expression level at corresponding 
positions, and then from these samples we can build up 
an estimate of the distribution $P(g|x)$. Because expression 
profiles vary systematically with time during nuclear 
cycle 14, it is important to make these measurements on 
embryos in a limited time class, which we do by taking 
the length of the cellularization membrane as a proxy for 
time ([26]; see Methods). We also confine our attention 
to the central 80% of the anterior-posterior axis, both 
because quantitative imaging at the poles is more diffi- 
cult and because we know that there are additional genes 
associated specifically with terminal patterning. 

As has been addressed in other contexts (see Methods), 
care is required to be sure that the finite number of 
samples we collect is sufficient to get a reliable estimate 
of $I(g;x)$, but once we have control over the potential 
systematic errors the statistical errors in our measurements 
are very small. Analysis of the data in Fig. 1 shows 
that the expression level of Hunchback provides 
$I_{g_{\rm Hb} \to x} = 2.26 \pm 0.04$ bits of information about the position of a cell 
along the middle 80% of the anterior-posterior axis. We 
can repeat this analysis for the gap genes Krüppel, giant 
and knirps, in addition to hunchback, and we find 
$I_{g_{\rm Kr} \to x} = 1.95 \pm 0.07$ bits, $I_{g_{\rm Gt} \to x} = 1.84 \pm 0.05$ bits, and 
$I_{g_{\rm Kni} \to x} = 1.75 \pm 0.05$ bits. 

FIG. 2. Reproducibility of multiple pattern elements along 
the anterior-posterior axis. Top panel: optical section through 
the midsagittal plane of a Drosophila embryo with immunofluorescence 
staining against Eve protein; scale bar is 100 μm. 
Middle panels: normalized dorsal profiles of fluorescence intensity 
from 12 embryos selected in a 40 to 50 min time window 
after the beginning of nuclear cycle 14 (light blue lines); 
dorsal profile of top panel embryo is in darker blue. Zooming 
in on a single peak (as shown at right), we can measure the 
standard deviation of both the expression level and position 
of this element in the pattern. Bottom panel summarizes results 
from such measurements on Even-skipped (blue) and 
Runt (magenta), plotting the standard deviation of the position 
$\sigma_x$ as a function of the mean position $\bar{x}$, together with 
a similar measurement on the reproducibility of the cephalic 
furrow. Note that all of the elements are positioned with 1% 
accuracy or better. 

In all cases, the expression of a single gene carries much 
more than one bit of information, indeed more nearly two 
bits. The conventional view of the gap genes is that they 
are characterized by domains of expression, with boundaries, 
and the sharpness of the boundary often is taken as 
a measure of precision. But if the patterns of expression 
were perfect on/off domains with infinitely sharp boundaries, 
then the expression level could provide at most one 
bit of information about position. Our result that gap 
genes provide nearly two bits of information about position 
demonstrates that intermediate expression levels 
are sufficiently reproducible from embryo to embryo that 
they carry significant amounts of positional information, 
and that the view of domains and boundaries misses almost 
half of this information. 
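
The one-bit ceiling for on/off domains, and the way graded levels can exceed it, can be illustrated with a toy calculation based on Eq. (5). This is a sketch under stated assumptions: positions are uniform on a unit axis, the noise is Gaussian with a fixed, hypothetical width of 5% of the maximal expression, and the two mean profiles (a sharp step and a linear ramp) are idealizations, not the measured gap gene profiles.

```python
import math

def info_bits(mean_profile, sigma, n_x=200, n_g=800, g_min=-0.5, g_max=1.5):
    """Eq. (5) for uniform P_x(x) and Gaussian noise of fixed width sigma."""
    dg = (g_max - g_min) / n_g
    g_grid = [g_min + (k + 0.5) * dg for k in range(n_g)]
    means = [mean_profile((j + 0.5) / n_x) for j in range(n_x)]
    norm = 1.0 / (sigma * math.sqrt(2 * math.pi))
    # P_g(g): average of the conditional Gaussians P(g|x) over all positions
    p_g = [sum(norm * math.exp(-0.5 * ((g - m) / sigma) ** 2) for m in means) / n_x
           for g in g_grid]
    s_total = -sum(p * math.log2(p) * dg for p in p_g if p > 0)
    # S[P(g|x)] for a Gaussian, the same at every x because sigma is fixed
    s_noise = 0.5 * math.log2(2 * math.pi * math.e * sigma ** 2)
    return s_total - s_noise

def step(x):
    """On/off domain with an infinitely sharp boundary at mid-embryo."""
    return 1.0 if x > 0.5 else 0.0

def ramp(x):
    """Graded profile in which every intermediate level is used."""
    return x

sigma = 0.05   # hypothetical noise level
i_step = info_bits(step, sigma)
i_ramp = info_bits(ramp, sigma)
print(f"step profile: {i_step:.2f} bits, ramp profile: {i_ramp:.2f} bits")
```

In this toy model the step profile saturates at one bit no matter how small the noise is made, while the graded profile carries more than two bits at the same noise level.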



IV. HOW MUCH INFORMATION DOES THE 
EMBRYO USE? 

If the expression profile of each gap gene were described 
by on/off domains with sharp boundaries, not only would 
a single gene carry at most one bit of information, four 
genes taken together could carry at most four bits — and 
this would happen only if the spatial arrangement of 
the different expression domains were carefully aligned 
to minimize redundancy. Four bits of information corre- 
sponds to, at most, 16 reliably distinguishable states en- 
coded by these genes, which seems small compared with 
the complexity of the pattern that eventually forms. But 
how much information does the embryo really need, or 
use? At best, every nucleus could be labelled with a 
unique identity, so that with N nuclei the embryo could 
make use of $\log_2 N$ bits. Along the anterior-posterior 
axis, we can count nuclei in a single mid-sagittal slice 
through the embryo, and in the middle 80% of the em- 
bryo where the images are clearest we have N = 58 ± 4 
along the dorsal side and N = 59 ± 4 along the ventral 
side, where the error bars represent standard deviations 
across a population of 57 embryos in nuclear cycle 14, 
corresponding to 5.9 ± 0.1 bits of information. But do 
individual cells in fact "know" their identity? More pre- 
cisely, are the elements of the anterior-posterior pattern 
specified with single cell resolution? 
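
The arithmetic behind these numbers is short enough to spell out. The sketch below uses only the nuclear counts quoted above; the variable names are ours, and the spacing estimate assumes the counted nuclei are spread evenly over the middle 80% of the axis.

```python
import math

n_dorsal, n_ventral, n_err = 58, 59, 4   # nuclear counts quoted in the text
frac_axis = 0.80                         # counts refer to the middle 80% of the axis

bits_dorsal = math.log2(n_dorsal)
bits_ventral = math.log2(n_ventral)
bits_low = math.log2(n_dorsal - n_err)    # lower end of the quoted range
bits_high = math.log2(n_ventral + n_err)  # upper end of the quoted range

states_on_off = 2 ** 4                    # four perfectly aligned on/off genes
spacing = frac_axis / n_dorsal            # nuclear spacing as a fraction of egg length

print(f"unique identities need {bits_dorsal:.2f} to {bits_ventral:.2f} bits "
      f"(range {bits_low:.2f} to {bits_high:.2f})")
print(f"four on/off genes distinguish at most {states_on_off} states")
print(f"nuclear spacing is about {100 * spacing:.2f}% of the egg length")
```

The last line is the scale against which a positional error of about 1% should be compared.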

There are several experiments suggesting that elements 
of the final body plan of the maggot can be traced to 
identifiable rows of cells along the anterior-posterior axis 
[27], which is consistent with the idea that each row of 
cells has a reproducible identity. More quantitatively, we 
can ask about the reproducibility of various pattern el- 
ements in early development, elements that appear not 
long after the expression patterns of the gap genes are 
established. A classic case is the cephalic furrow, which 
can be observed in live embryos and is known to have a 
position along the anterior-posterior axis that is reproducible 
with ~ 1% accuracy (see, for example, Ref [§]). 

Is the cephalic furrow special, or can the embryo more 
generally position pattern elements with ~ 1% accuracy? 
The striped patterns of pair rule gene expression allow 
us to ask about the position of multiple pattern ele- 
ments, seven peaks and six troughs of expression along 
the anterior-posterior axis. As shown in Fig. 2, all of these 
elements have positions that are reproducible to within 
1% of the embryo length. This strongly suggests that all 
cells "know" their position along the anterior-posterior 
axis with ~ 1% precision. 



V. DECODING THE POSITIONAL 
INFORMATION CARRIED BY MULTIPLE 
GENES 

Do the four gap genes, taken together, carry enough 
information to specify position with ~ 1% accuracy? To 
answer this, it is useful to look more directly at how the 
information is encoded. We observe the expression levels 
$g_i$, with i = 1, 2, 3, 4. At each point x there are average 
values of these expression levels $\bar{g}_i(x)$, and there are fluctuations 
$\delta g_i$. Let us assume that these fluctuations have 
a Gaussian distribution. If we look just at one gene, this 
means that the statistics of the fluctuations are described 
completely by the mean and the variance $\sigma_i^2(x)$, so that 
if we look at the same position x in many embryos we 
will see a distribution of expression levels 

$$ P(g_i|x) = \frac{1}{\sqrt{2\pi\sigma_i^2(x)}} \exp\left[ -\frac{(g_i - \bar{g}_i(x))^2}{2\sigma_i^2(x)} \right], \tag{6} $$

and this is in good agreement with the measurements in 
Fig. 1. If we look at many genes simultaneously, we have 
not just the variances of each gene but also the correlations 
or covariances among the genes, which define a 
matrix $C_{ij}(x)$. The joint distribution of expression levels 
at one point is then 

$$ P(\{g_i\}|x) = \frac{1}{\sqrt{(2\pi)^4 \det C}} \exp\left[ -\frac{1}{2} \sum_{i,j=1}^{4} (g_i - \bar{g}_i(x)) (C^{-1})_{ij} (g_j - \bar{g}_j(x)) \right], \tag{7} $$

where $C^{-1}$ denotes the inverse of the matrix C and det C 
denotes its determinant. But to "read" the information 
carried by the expression levels, we need to ask for the 
distribution of positions that are consistent with a particular 
set of expression levels that we might observe. By 
Bayes' rule, this can be written as 

$$ P(x|\{g_i\}) = \frac{P(\{g_i\}|x) P_x(x)}{P_g(\{g_i\})}, \tag{8} $$

where $P_x(x)$ is, as before, the (nearly uniform) distribution 
of cell positions and $P_g(\{g_i\})$ is the (joint) distribution 
of expression levels averaged over all cells in the 
embryo. 
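
The decoding step in Eqs. (7) and (8) can be sketched on a grid. Everything specific in the example is an assumption made for illustration: it uses two genes rather than four, two invented mean profiles (an anterior sigmoid and a central stripe), a position-independent 2 x 2 covariance, and a uniform $P_x(x)$ on a unit axis.

```python
import math

def mean_profiles(x):
    """Two hypothetical smooth mean profiles (not the measured gap gene data)."""
    anterior = 1.0 / (1.0 + math.exp((x - 0.45) / 0.05))
    stripe = math.exp(-0.5 * ((x - 0.6) / 0.1) ** 2)
    return anterior, stripe

def gaussian_likelihood(g, x, cov):
    """Two-gene version of Eq. (7) with a symmetric 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    m0, m1 = mean_profiles(x)
    d0, d1 = g[0] - m0, g[1] - m1
    # quadratic form (g - gbar)^T C^{-1} (g - gbar), with C^{-1} written out
    quad = (d * d0 * d0 - 2 * b * d0 * d1 + a * d1 * d1) / det
    return math.exp(-0.5 * quad) / (2 * math.pi * math.sqrt(det))

def posterior(g, cov, n=400):
    """Eq. (8) on a grid, with uniform P_x(x) = 1 on the unit interval."""
    dx = 1.0 / n
    grid = [(k + 0.5) * dx for k in range(n)]
    like = [gaussian_likelihood(g, x, cov) for x in grid]
    p_g = sum(like) * dx                  # normalization P_g({g_i})
    return grid, [value / p_g for value in like]

cov = ((0.05 ** 2, 0.0), (0.0, 0.05 ** 2))   # hypothetical noise covariance
x_true = 0.5
observed = mean_profiles(x_true)             # noiseless readout at x_true
grid, post = posterior(observed, cov)
x_map = grid[max(range(len(post)), key=post.__getitem__)]
print(f"most probable position: {x_map:.4f} (true position {x_true})")
```

With a noisy readout the peak shifts, and the width of the posterior is what the error bar introduced in the next paragraphs summarizes.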

FIG. 3. Simultaneous immunostaining of Hunchback, Krüppel, Giant and Knirps. Top panel: absorption (dashed lines) and 
emission (plain lines) spectra of the secondary dyes used for simultaneous immunostaining of 4 proteins, the laser excitation 
wavelength (in black) and the position and bandwidth of the filters for each detection channel. Below are optical sections 
through the midsagittal plane of a single Drosophila embryo with co-immunofluorescence staining against Knirps (green), 
Krüppel (yellow), Giant (orange) and Hunchback (red); scale bar is 100 μm. To estimate the crosstalk for each channel, we 
compare the intensity profile in the sample embryo (plain line) with a control embryo of similar age and orientation for which 
all of the channels but the considered one have been co-stained (dashed line). The ratio between the two curves is plotted in 
grey; note the different scale at right. The bottom panels show the dorsal expression levels of the four gap genes for 24 embryos 
(light colors). The dorsal expression levels of the sample embryo are plotted in darker colors. 

If the noise levels are small, then $P(x|\{g_i\})$ will be 
sharply peaked at some $x^*(\{g_i\})$ which is our best estimate 
of the position given our observations on the expression 
levels. Expanding around this estimate, we find 
that the distribution is approximately Gaussian, 

$$ P(x|\{g_i\}) \approx \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\left[ -\frac{(x - x^*(\{g_i\}))^2}{2\sigma_x^2} \right], \tag{9} $$

where the error in our position estimate is defined by 

$$ \frac{1}{\sigma_x^2} = \sum_{i,j=1}^{4} \frac{d\bar{g}_i(x)}{dx} (C^{-1})_{ij} \frac{d\bar{g}_j(x)}{dx}. \tag{10} $$

Equation (10) tells us the precision with which expression 
levels encode position: observing the expression levels 
$\{g_i\}$ allows us (or the cell!) to specify position with an 
"error bar" $\sigma_x$. Note that this error could be different at 
different points in the embryo, so really we should write 
$\sigma_x(x)$. Checking our intuition, we see that this error bar 
is smaller when the variability in expression is smaller 
(smaller C), when the mean spatial variations in expression 
levels are stronger (larger $d\bar{g}_i/dx$), or when we can 
sum over more information carried by more genes. We 
can define a similar quantity based on measurements of 
a single gene, 

$$ \frac{1}{\sigma_x(x)} = \left| \frac{d\bar{g}_i(x)}{dx} \right| \frac{1}{\sigma_i(x)}, \tag{11} $$

and this construction is shown schematically at the top 
of Fig. 4. Note that when $\sigma_x$ is small, we can justify 
our approximation that $P(x|\{g_i\})$ is sharply peaked, but 
when $\sigma_x$ becomes large it is more rigorous simply to say 
that we don't have much information about x, rather 
than trying to give a more quantitative interpretation. 
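
The error bars of Eqs. (10) and (11) are simple to evaluate once mean profiles and a covariance matrix are in hand. The sketch below is illustrative only: it assumes two genes with invented mean profiles, a hypothetical uncorrelated noise level of 5% of maximal expression, and derivatives taken by central finite differences.

```python
import math

def mean_profiles(x):
    """Two hypothetical smooth mean profiles (not the measured gap gene data)."""
    anterior = 1.0 / (1.0 + math.exp((x - 0.45) / 0.05))
    stripe = math.exp(-0.5 * ((x - 0.6) / 0.1) ** 2)
    return anterior, stripe

def slopes(x, h=1e-5):
    """Central finite-difference estimate of d gbar_i / dx for each gene."""
    lo, hi = mean_profiles(x - h), mean_profiles(x + h)
    return tuple((b - a) / (2 * h) for a, b in zip(lo, hi))

def sigma_x_single(x, i, sigma_g):
    """Eq. (11): sigma_x = sigma_i / |d gbar_i / dx| for gene i alone."""
    return sigma_g / abs(slopes(x)[i])

def sigma_x_joint(x, cov):
    """Two-gene version of Eq. (10) with a symmetric 2x2 covariance matrix."""
    (a, b), (_, d) = cov
    det = a * d - b * b
    s0, s1 = slopes(x)
    inv_var = (d * s0 * s0 - 2 * b * s0 * s1 + a * s1 * s1) / det
    return 1.0 / math.sqrt(inv_var)

sigma_g = 0.05                                   # hypothetical noise level
cov = ((sigma_g ** 2, 0.0), (0.0, sigma_g ** 2)) # uncorrelated noise (assumption)
x0 = 0.5
single = [sigma_x_single(x0, i, sigma_g) for i in range(2)]
joint = sigma_x_joint(x0, cov)
print("single-gene errors:", [round(s, 4) for s in single])
print("joint error:", round(joint, 4))
```

For uncorrelated noise the inverse variances simply add, so the joint error is smaller than either single-gene error; correlations in C change how the genes combine but not the structure of the calculation.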



Importantly, all the terms in Eq. (10) are experimentally 
accessible. Measurements of the average expression 
profiles $\bar{g}_i(x)$ are standard. Ideally, to measure the covariance 
matrix $C_{ij}(x)$ we should observe all four genes 
at once, in multiple embryos, and such experiments are 
shown in Fig. 3. Alternatively, one can make measurements 
in which pairs of genes i, j are stained, and each 
such experiment contributes to estimates of the matrix 
elements $C_{ii}$, $C_{jj}$, and $C_{ij}$. With care, as described in 
Methods, such pairwise experiments can be merged to 
give the same results as the more direct quadruple staining. 
The major difficulty in the quadruple staining experiment 
is to avoid spectral crosstalk among the different 
fluorescence signals, but as noted in the Methods modest 
amounts of crosstalk actually don't change our estimate 
of $\sigma_x$. 

FIG. 4. Positional error as a function of position. Upper left 
panel: geometrical interpretation of the positional error of a 
single gene at a given position. $\sigma_x(x)$ is proportional to the 
spread of the profiles across embryos and is inversely proportional 
to the derivative of the mean profile. The upper right panel 
summarizes the error in positional estimates based on the 
Hunchback level for a hundred points along the anteroposterior 
axis. Lower panel: error in positional estimates based on 
the expression levels of the four gap genes together (in black), 
from Eq. (10); error bars are from bootstrapping. For reference, 
positional errors based on individual expression levels 
are plotted in lighter colors in the background. Note that the 
net positional error is nearly constant and equal to 1% of the 
total egg length. 

Measurements of $\sigma_x$ are summarized in Fig. 4. Remarkably, 
the reliability of position estimates based on 
the four gap genes is ~ 1%, almost precisely equal to 
the observed reproducibility with which pattern elements 
are positioned along the anterior-posterior axis. This is 
strong evidence that the gap genes, taken together, carry 
the information needed to specify the full pattern. Further, 
this positional accuracy is almost constant along the 
length of the embryo, which again is consistent with what 
we see in Fig. 2. This constancy emerges in a nontrivial 
way from the expression profiles, the noise levels, and the 
correlation structure of the noise. If we try to make estimates 
based on one gene, we can reach ~ 1% accuracy in 
a very limited region of the embryo, and estimates from 
the different genes have their optimal precision in different 
places. The detailed structure of the spatial profiles 
ensures that these signals can be combined to give nearly 
constant accuracy. 

If the errors in estimating position really are Gaussian, 
as in Eq. (9), then we can substitute into Eq. (3) to show 
that $I = \langle \log_2 [L/(\sigma_x \sqrt{2\pi e})] \rangle$, where L is the length of 
the embryo and $\langle \cdots \rangle$ denotes an average over the possible 
position dependence of the error $\sigma_x$. Computing 
this average, we have I = 4.57 ± 0.02 bits. On the other 
hand, we can use the distribution of expression levels at 
each position, Eq. (7), to compute the information directly 
as in Eq. (5), and we find I = 4.97 ± 0.23 bits. The 
agreement between these estimates supports our approximations, 
and gives us confidence that the measurement 
of $\sigma_x$ in Fig. 4 really does characterize the encoding of 
positional information by the gap genes. 
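
The relation between the error bar and the number of bits can be checked with a few lines of arithmetic. This sketch evaluates the Gaussian formula quoted above for a constant error of exactly 1% of L, which is an idealization of the nearly constant measured error, and inverts it for the quoted 4.57 bits; the function names are ours.

```python
import math

SQRT_2PI_E = math.sqrt(2 * math.pi * math.e)

def info_bits(sigma_over_L_values):
    """I = < log2[ L / (sigma_x sqrt(2 pi e)) ] >, averaged over positions."""
    terms = [math.log2(1.0 / (s * SQRT_2PI_E)) for s in sigma_over_L_values]
    return sum(terms) / len(terms)

def error_from_info(bits):
    """Constant sigma_x / L that would give the stated number of bits."""
    return 2.0 ** (-bits) / SQRT_2PI_E

i_one_percent = info_bits([0.01])        # constant 1% error (idealization)
sigma_for_4p57 = error_from_info(4.57)   # error implied by the quoted 4.57 bits

print(f"sigma_x = 1% of L gives {i_one_percent:.2f} bits")
print(f"4.57 bits corresponds to sigma_x = {100 * sigma_for_4p57:.2f}% of L")
```

A constant 1% error corresponds to about 4.6 bits, consistent with the quoted estimate from the measured, slightly position-dependent error.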



VI. A SIGNATURE OF OPTIMIZATION? 

The discussion thus far concerns the amount of infor- 
mation that actually is transmitted by the levels of gap 
gene expression. But we know that the capacity to transmit 
information is strictly limited by the available numbers 
of molecules, and that significant increases in information 
capacity would require vastly more than proportional 
increases in these numbers [11]. Given these limitations, 
however, cells can still make more or less efficient 
use of the available capacity. To maximize efficiency, the 
input/output relations and noise characteristics of the 
regulatory network must be matched to the distribution 
of input transcription factor concentrations [15]. This 
matching principle has a long history in the analysis of 
neural coding [28-30], and in Ref. [15] it was suggested 
that the regulation of Hunchback by Bicoid might pro- 
vide an example of this principle. Here we consider the 
generalization of this argument to the gap gene network 
as a whole. 

If we imagine that there is a single primary morphogen, 
then the expression levels of the different gap genes, taken 
together, can be thought of as encoding the concentration 
c of this morphogen. By analogy with Eq. (10), 
these expression levels can be decoded with some accuracy 
$\sigma_c^{\rm eff}(c)$, which itself depends on the mean local concentration. 
The key result of Ref. [15] is that, when noise 
levels are small, all the "symbols" in the code should be 
used in proportion to their reliability, or in inverse proportion 
to their variability. Thus, if we point to a cell 
at random, we should see that the concentration of the 
primary morphogen is drawn from a distribution 

$$ P_{\rm input}(c) = \frac{1}{Z} \frac{1}{\sigma_c^{\rm eff}(c)}, \tag{12} $$

where the constant Z is chosen to normalize the distribution. 
But the input is a morphogen, so its variation is 
connected with the physical position x of cells along the 
embryo: we should have c = c(x). Then if the cells are 
distributed uniformly along the length of the embryo, the 
probability that we find a cell at x is just $P(x) = 1/L$, 
and hence we must have 

$$ P_{\rm input}(c)\, dc = P(x)\, dx = \frac{dx}{L} \tag{13} $$
$$ \Rightarrow P_{\rm input}(c) = \frac{1}{L} \left| \frac{dc(x)}{dx} \right|^{-1}. \tag{14} $$

We have two expressions for the distribution of input 
transcription factor concentrations: Eq. (14), which expresses 
the role of the input as morphogen, encoding position 
x, and Eq. (12), which expresses the solution to the 
problem of optimizing information transmission through 
the network of genes that respond to the input. Putting 
these expressions together, we have 

$$ \frac{Z}{L} = \frac{1}{\sigma_c^{\rm eff}(c)} \left| \frac{dc(x)}{dx} \right| = \frac{1}{\sigma_x(x)}, \tag{15} $$

where in the last step we recognize the equivalent positional 
noise $\sigma_x(x)$ by analogy with Eq. (10). Thus, optimizing 
information transmission predicts that the positional 
uncertainty $\sigma_x(x)$ will be constant along the length 
of the embryo, as observed in Fig. 4. A more detailed version 
of this argument is given in the Appendix. 
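
A toy case makes the matching condition concrete. The sketch below assumes, purely for illustration, an exponential morphogen gradient $c(x) = c_0 e^{-x/\lambda}$ and an effective noise proportional to concentration, $\sigma_c^{\rm eff}(c) = f c$; neither choice is taken from the measurements, and all parameter values are hypothetical. For this pair the input distribution implied by uniform cell density is proportional to $1/c$, which is also $1/\sigma_c^{\rm eff}$, so Eq. (15) should hold with a constant $\sigma_x = f \lambda$.

```python
import math

# All values below are hypothetical and chosen only for illustration.
L, lam, c0, frac = 1.0, 0.2, 1.0, 0.1

def c_of_x(x):
    """Assumed exponential morphogen profile."""
    return c0 * math.exp(-x / lam)

def sigma_c(c):
    """Assumed effective noise: constant fractional error in concentration."""
    return frac * c

def sigma_x(x, h=1e-6):
    """Positional error sigma_c / |dc/dx|, as on the right-hand side of Eq. (15)."""
    slope = (c_of_x(x + h) - c_of_x(x - h)) / (2 * h)
    return sigma_c(c_of_x(x)) / abs(slope)

# Z normalizes P_input(c) = 1 / (Z sigma_c(c)) over the range [c(L), c(0)]
n = 20000
c_lo, c_hi = c_of_x(L), c_of_x(0.0)
dc = (c_hi - c_lo) / n
Z = sum(dc / sigma_c(c_lo + (k + 0.5) * dc) for k in range(n))

errors = [sigma_x(x) for x in (0.1, 0.3, 0.5, 0.7, 0.9)]
print("sigma_x along the axis:", [round(e, 5) for e in errors])
print("Z / L =", round(Z / L, 3), " 1 / sigma_x =", round(1 / errors[0], 3))
```

If the noise model is changed so that $\sigma_c^{\rm eff}$ is no longer proportional to c, the computed $\sigma_x$ varies along the axis and the match between the two expressions is lost.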



VII. DISCUSSION 

The final result of embryonic development appears precise 
and reproducible. Less is known quantitatively about 
the degree of this precision, and about the time at which 
precision first becomes apparent. Our central result is 
that, in the early Drosophila embryo, the patterns of gap 
gene expression provide enough information to specify 
the positions of individual cells with a precision of ~ 1% 
along the anterior-posterior axis. This is the same precision 
with which subsequent pattern elements are specified, 
from the pair rule expression stripes through the 
cephalic furrow, so that all the required information is 
available from a local readout of the gap genes. 

The precise value of the information that we observe 
is also interesting. It corresponds to being able to locate 
any nucleus with an error bar that is smaller than the 
distance to its neighbor, but the total number of bits is 
not quite large enough to specify the position of every 
cell uniquely. The difference is that when we make an 
estimate with error bars, the estimate comes from a distribution 
with tails, and the (small) overlap of the tails of 
these distributions means that one cannot quite identify 
every cell. It is possible that cells in fact do not quite have 
unique identities, or that the missing information is hiding 
in correlations among the errors at different points: 
although the gap genes encode position with an error bar, 
the difference between positions coded by expression levels 
in neighboring cells could have a much smaller error 
bar. While further experiments are required to settle this 
issue, we find it remarkable that the gap gene expression 
levels carry so much information, so that an enormously 
precise pattern is available very early in development. 

The information that gene expression levels can carry 
about position is limited by noise. In particular, both 
because the concentrations of transcription factors are 
low, and because the absolute copy numbers of the output 
proteins are small, there are physical sources of noise 
that cannot be reduced without the embryo investing 
more resources in making these molecules. Given these 
limits, it still is possible to transmit more information 
through the gap gene network by "matching" the distribution 
of input signals to the noise characteristics of 
the network. Although this matching condition is in general 
complicated, in the limit that the noise is small it 
can be expressed very simply: the density of cells along 
the anterior-posterior axis should be inversely proportional 
to the error bar with which we can infer position 
by decoding the signals carried in the gap gene expression 
levels. Since cells are almost uniformly distributed 
at this stage of development, this predicts that an optimal 
network would have a uniform precision, and this is 
what we find. This uniformity emerges despite the complex 
spatial dependence of all the ingredients, and thus 
seems likely to be a signature of selection for optimal 
information transmission. 



APPENDIX 

Here we give a detailed version of the arguments leading to Eq (15). Consider the case where information flows
from a single input transcription factor (such as Bicoid)
to a set of K output genes (the gap genes). The concentration
of the input is c, and the output genes have
expression levels $g_1, g_2, \cdots, g_K$ [TBHISj. Different cells
in the embryo experience different values of c, depending
on their position, and if we choose a cell at random it
sees a concentration drawn from the distribution $P_{\rm in}(c)$.
The network responds to this input, generating expression
levels that are drawn from the distribution $P(\{g_i\}|c)$;
it will also be useful to define the (joint) distribution of
output expression levels,



$$P_{\rm out}(\{g_i\}) = \int dc\, P_{\rm in}(c)\, P(\{g_i\}|c). \tag{16}$$



The information that flows from input to output can then 
be written as the difference of entropies, as in Eq (§, 



$$I(\{g_i\}; c) = -\int dc\, P_{\rm in}(c) \log_2 P_{\rm in}(c) - \int d^K g\, P_{\rm out}(\{g_i\})\, S[P(c|\{g_i\})], \tag{17}$$



where, from Bayes' rule, we have 

$$P(c|\{g_i\}) = \frac{P(\{g_i\}|c)\, P_{\rm in}(c)}{P_{\rm out}(\{g_i\})}. \tag{18}$$
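
Eqs (16)-(18) can be evaluated numerically once the input and output axes are discretized. The following is a minimal sketch for a single output gene, assuming a hypothetical sigmoidal mean response and Gaussian output noise of fixed width; the response function, the grids and the parameter values are illustrative assumptions and do not come from the data.

```python
import numpy as np

def entropy_bits(p, axis=None):
    """Shannon entropy in bits, treating 0 log 0 as 0."""
    p = np.asarray(p, dtype=float)
    logp = np.log2(p, where=p > 0, out=np.zeros_like(p))
    return -(p * logp).sum(axis=axis)

def transmitted_information(p_in, p_g_given_c):
    """Discrete analogue of Eq (17).

    p_in:        shape (n_c,), probability of each input bin
    p_g_given_c: shape (n_c, n_g), each row a distribution over output bins
    """
    joint = p_in[:, None] * p_g_given_c                 # P(c, g)
    p_out = joint.sum(axis=0)                           # Eq (16)
    p_c_given_g = np.divide(joint, p_out,               # Eq (18), Bayes' rule
                            out=np.zeros_like(joint), where=p_out > 0)
    cond_entropy = entropy_bits(p_c_given_g, axis=0)    # S[P(c|g)] per output bin
    return float(entropy_bits(p_in) - np.sum(p_out * cond_entropy))

# Hypothetical example: uniform inputs, sigmoidal mean output, Gaussian noise.
c = np.linspace(0.0, 1.0, 200)
p_in = np.full(c.size, 1.0 / c.size)
g_mean = c**5 / (c**5 + 0.5**5)
g_bins = np.linspace(-0.3, 1.3, 400)
sigma_g = 0.05
p_g_given_c = np.exp(-0.5 * ((g_bins[None, :] - g_mean[:, None]) / sigma_g) ** 2)
p_g_given_c /= p_g_given_c.sum(axis=1, keepdims=True)

info = transmitted_information(p_in, p_g_given_c)
print(f"I = {info:.3f} bits")
```

The same function accepts a non-uniform `p_in`, which makes it possible to explore how the transmitted information depends on the choice of input distribution.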



The transmitted information $I(\{g_i\}; c)$ depends both
on the characteristics of the gene network, expressed in
$P(\{g_i\}|c)$, and on the distribution of input signals, $P_{\rm in}(c)$.
In particular, the irreducible noise associated with the
finite number of available molecules is encoded by the
details of $P(\{g_i\}|c)$. Given these constraints it still is
possible to maximize information transmission by proper 
choice of the input distribution P33 HD] • In general this 




optimization is a hard problem, but we can make progress 
if we assume that the noise is small, and we will argue 
that this is a good approximation. 

In Eq (17), we need to take an average over the full
distribution of output expression levels, $P_{\rm out}(\{g_i\})$. This
distribution is broadened by two effects. First, the inputs
c are varying, and the outputs would vary in response.
Second, even when the input c is fixed, the outputs $\{g_i\}$
may vary because of noise. We will assume that noise
is small in the sense that the first effect is much larger
than the second, so that we can average over outputs by
assuming that the output is always equal to its average
value, $g_i = \bar g_i(c)$, and then averaging over the input c. In
this approximation, the information becomes



$$I = -\int dc\, P_{\rm in}(c) \log_2 P_{\rm in}(c) - \int dc\, P_{\rm in}(c)\, S^{(c)}_{\rm cond}(\{g_i = \bar g_i(c)\}), \tag{19}$$



where $S^{(c)}_{\rm cond}(\{g_i\}) = S[P(c|\{g_i\})]$. To find the distribution
of inputs that maximizes the information, we introduce
as usual a Lagrange multiplier to fix the normalization
of $P_{\rm in}(c)$ and solve



$$\frac{\delta}{\delta P_{\rm in}(c)} \left[ I - \Lambda \int dc\, P_{\rm in}(c) \right] = 0. \tag{20}$$
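
For completeness, the intermediate step can be sketched as follows. This assumes, as our reading of the small-noise approximation, that the conditional entropy in Eq (19) depends on c only through the noise level and not on the choice of $P_{\rm in}(c)$ itself. Writing $S_{\rm cond}(c)$ as shorthand for $S^{(c)}_{\rm cond}(\{g_i = \bar g_i(c)\})$ and $\Lambda$ for the Lagrange multiplier, the variation gives

$$-\log_2 P_{\rm in}(c) - \frac{1}{\ln 2} - S_{\rm cond}(c) - \Lambda = 0 \quad\Longrightarrow\quad P_{\rm in}(c) \propto 2^{-S_{\rm cond}(c)} = \exp\left[-(\ln 2)\, S_{\rm cond}(c)\right],$$

with the constants $1/\ln 2$ and $\Lambda$ absorbed into the normalization.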



The result is 



$$P_{\rm in}(c) = \frac{1}{Z} \exp\left[ -(\ln 2)\, S^{(c)}_{\rm cond}(\{g_i = \bar g_i(c)\}) \right], \tag{21}$$



where Z is chosen to normalize the distribution. The only
approximation we have made thus far is to assume that
the noise is small. But if the noise is also approximately
Gaussian, so that given knowledge of the gene expression levels
$\{g_i\}$ we know the input concentration to within some
error bar $\sigma_c^{\rm eff}(c)$, which itself depends on the actual value
of the input, then $S^{(c)}_{\rm cond} = \log_2[\sqrt{2\pi e}\,\sigma_c^{\rm eff}(c)]$, and

$$P_{\rm in}(c) = \frac{1}{Z}\, \frac{1}{\sigma_c^{\rm eff}(c)}, \tag{22}$$



corresponding to Eq (12) in the text. As discussed in Ref
[15], this tells us that the system can optimize information
transmission by using the "symbols" c in proportion
to their reliability.
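
The statement that inputs should be used in proportion to their reliability can be checked numerically. The sketch below evaluates the small-noise information of Eq (19) with the Gaussian form of the conditional entropy, for a hypothetical noise profile in which the effective error bar grows linearly with c. The noise model, the grid and the comparison distributions are illustrative assumptions only.

```python
import numpy as np

SQRT_2PIE = np.sqrt(2.0 * np.pi * np.e)

def small_noise_information(p_in, sigma_eff, dc):
    """Eq (19) in bits, with S_cond = log2(sqrt(2*pi*e) * sigma_eff)."""
    s_cond = np.log2(SQRT_2PIE * sigma_eff)
    return float(np.sum(p_in * (-np.log2(p_in) - s_cond)) * dc)

c = np.linspace(0.05, 1.0, 2000)
dc = c[1] - c[0]
sigma_eff = 0.01 + 0.05 * c              # hypothetical effective error bar

# Optimal distribution of Eq (22): proportional to 1 / sigma_eff.
Z = np.sum(1.0 / sigma_eff) * dc
p_opt = 1.0 / (Z * sigma_eff)

# Two alternatives, normalized on the same grid.
p_uniform = np.full_like(c, 1.0 / (c.size * dc))
p_tilted = p_opt * (1.0 + 0.3 * np.sin(2.0 * np.pi * c))
p_tilted /= np.sum(p_tilted) * dc

info_opt = small_noise_information(p_opt, sigma_eff, dc)
info_uniform = small_noise_information(p_uniform, sigma_eff, dc)
info_tilted = small_noise_information(p_tilted, sigma_eff, dc)
print(info_opt, info_uniform, info_tilted)
```

In this approximation the optimum evaluates to log2(Z / √(2πe)), and any other normalized distribution falls short of it by its Kullback-Leibler divergence from the optimum.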

Notice that the size of the noise in the system can be
summarized by $\sigma_x$ itself. Not only do we find, experimentally,
that this is nearly constant, it is also very small, and
in particular smaller than the distances over which the
output of any single gap gene varies significantly. Thus,
in retrospect, the effective noise really is small, as assumed
above, which justifies the approximation leading
to Eq (21). This derivation can be generalized to cases
where there are multiple independent morphogen inputs,
each varying along x.



METHODS 

Fixation and staining. All embryos were collected at
25 °C and dechorionated in 100% bleach for 2 minutes,
then heat fixed in a saline solution (NaCl, Triton X-100)
and vortexed in a vial containing 5 mL of heptane and 5
mL of methanol for one minute. They were then rinsed
and stored in methanol at -20 °C. Embryos were labeled
with fluorescent probes. We used rat anti-Kni, guinea pig 
anti-Gt, rabbit anti-Kr (gift of C. Rushlow), and mouse 
anti-Hb. Secondary antibodies were respectively conju- 
gated with Alexa-488 (rat), Alexa-568 (rabbit), Alexa- 
594 (guinea pig) and Alexa-647 (mouse) from Invitro- 
gen. Embryos were mounted in AquaPolymount from 
Polysciences, Inc. 

Imaging and profile extraction. All embryos were imaged
on a Leica SP5 laser-scanning confocal microscope,
and image analysis routines were implemented in
MATLAB (MathWorks, Natick, MA). Images
were taken with a Leica 20x HC PL APO NA 0.7
oil immersion objective, with sequential excitation wavelengths
of 488, 546, 594 and 633 nm. For each embryo,
three high-resolution images (1024 x 1024 pixels, with
12 bits and at 100 Hz) were taken along the anteroposterior
axis (focused at the midsagittal plane) at 1.7x
magnified zoom and three times frame-averaged. With
these settings, the linear pixel dimension corresponds to
0.44 ± 0.01 μm. Profiles were extracted by sliding, in
software, a disk of the size of a nucleus along the edge 
of the embryo in the midsagittal plane and computing 
the average intensity of its pixels. The coordinates of 
the disk centers were projected on the anterior-posterior 
and dorso-ventral axes of the embryo. This yields two
profiles, one for the dorsal and one for the ventral side of
the embryo. For consistency, only dorsal profiles are
used in our analysis. We follow the methods of Ref [5] to
convert measured fluorescence intensities into normalized 
protein concentrations. 
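
The disk-averaging step can be sketched as follows. This is an illustrative Python version, not the MATLAB routines used for the analysis; it assumes the edge coordinates and the positions of the anterior and posterior poles are already known, and the disk radius and all function names are hypothetical.

```python
import numpy as np

def disk_mean(image, center_xy, radius):
    """Mean pixel intensity inside a disk of the given radius (in pixels)."""
    yy, xx = np.ogrid[:image.shape[0], :image.shape[1]]
    mask = (xx - center_xy[0]) ** 2 + (yy - center_xy[1]) ** 2 <= radius ** 2
    return float(image[mask].mean())

def extract_profile(image, edge_xy, radius):
    """Average intensity in a disk centred on each point along the embryo edge."""
    return np.array([disk_mean(image, p, radius) for p in edge_xy])

def fractional_ap_position(edge_xy, anterior_xy, posterior_xy):
    """Project disk centres onto the anterior-posterior axis (0 = anterior pole)."""
    axis = np.asarray(posterior_xy, float) - np.asarray(anterior_xy, float)
    rel = np.asarray(edge_xy, float) - np.asarray(anterior_xy, float)
    return rel @ axis / (axis @ axis)

# Synthetic test image: intensity equals the column index.
image = np.tile(np.arange(100.0), (80, 1))
edge_xy = np.array([[20, 40], [50, 40], [80, 40]])
profile = extract_profile(image, edge_xy, radius=7)
x_over_L = fractional_ap_position(edge_xy, anterior_xy=(10, 40), posterior_xy=(90, 40))
print(profile, x_over_L)
```

On real images the edge coordinates would come from a segmentation of the embryo boundary, which is not shown here.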

Determining the age of the embryo. The time since ini- 
tiation of nuclear cycle 14 was determined by the length 
of the dorsal cellularization membrane (5^1 EI] • A series 
of N = 10 brightfield movies of wildtype OreR embryos 
was used to obtain a calibration of cellularization pro- 
gression. The measured length of each immunostained 
embryo used in our analysis was compared with the reference
to convert length into time. The error bar in age estimation
by this method is ±3 min. Embryos were sorted
according to five time intervals (0-10 mins, 10-30 mins, 
30-40 mins, 40-50 mins and 50-60 mins), and our analysis 
here focuses on the 30-40 min class. 
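
The conversion from membrane length to time can be illustrated with a simple interpolation against a calibration curve. The calibration values below are invented placeholders, not the measured curve from the brightfield movies; only the five time intervals are taken from the text, and the function names are ours.

```python
import numpy as np

# HYPOTHETICAL calibration: time into nuclear cycle 14 (min) versus
# membrane length (um). Placeholder numbers for illustration only.
calib_time_min = np.array([0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
calib_length_um = np.array([0.0, 3.0, 6.0, 10.0, 17.0, 26.0, 35.0])

def age_from_membrane_length(length_um):
    """Interpolate the (monotonic) calibration curve to convert length into time."""
    return np.interp(length_um, calib_length_um, calib_time_min)

# Interior edges of the five classes: 0-10, 10-30, 30-40, 40-50, 50-60 min.
class_edges_min = np.array([10.0, 30.0, 40.0, 50.0])

def time_class(age_min):
    """Index of the time class (0..4); class 2 is the 30-40 min interval."""
    return np.digitize(age_min, class_edges_min)

age = age_from_membrane_length(13.5)
print(age, time_class(age))
```

With a measured calibration curve in place of the placeholder arrays, embryos with class index 2 would be the ones retained for the 30-40 min analysis.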

Information in single genes. Measurements on the expression
profiles of a single gene in multiple embryos can
be thought of as providing many samples out of the joint
distribution $P(g, x)$. To compute the mutual information
between g and x, we discretize the two continuous axes
into a number of bins; along the g axis we use these bins
adaptively, so that the histogram of g in these bins is
nearly flat. We then take the counts in each bin as an
estimate of the probability, compute the information and
examine the dependence on the number of bins and the
number of samples. Following Refs [32, 33], we search
for expected systematic dependencies, and extrapolate
to the limit where the number of bins and samples both
become large. We can obtain an upper bound on the information
by assuming that the conditional distribution
$P(g|x)$ is Gaussian, and we can obtain an approximation
to the information by taking this Gaussian approximation
through to the construction of $P_g(g)$; all these
estimation procedures agree within error bars.
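
The direct estimate can be sketched as follows. This is a minimal illustration on synthetic data, assuming equal-population bins along g, equal-width bins along x, and a correction that is linear in the inverse number of samples; that linear form is a common choice and is an assumption here, not necessarily the exact procedure of the cited references. The extrapolation in the number of bins is not shown.

```python
import numpy as np

def mutual_information_bits(x, g, n_bins):
    """Plug-in estimate of I(g; x) from samples, in bits."""
    # Equal-width bins along x.
    x_idx = np.minimum((n_bins * (x - x.min()) / (x.max() - x.min())).astype(int),
                       n_bins - 1)
    # Adaptive bins along g: equal numbers of samples per bin (flat histogram).
    g_rank = np.argsort(np.argsort(g))
    g_idx = (n_bins * g_rank) // g.size
    counts = np.zeros((n_bins, n_bins))
    np.add.at(counts, (x_idx, g_idx), 1.0)
    p = counts / counts.sum()
    px = p.sum(axis=1, keepdims=True)
    pg = p.sum(axis=0, keepdims=True)
    mask = p > 0
    return float(np.sum(p[mask] * np.log2(p[mask] / (px @ pg)[mask])))

def extrapolate_in_sample_size(x, g, n_bins, fractions=(1.0, 0.8, 0.6, 0.5),
                               n_rep=20, seed=0):
    """Fit I(N) = I_inf + a / N over random subsamples and return I_inf."""
    rng = np.random.default_rng(seed)
    inv_n, estimates = [], []
    for f in fractions:
        m = int(f * x.size)
        vals = [mutual_information_bits(x[idx], g[idx], n_bins)
                for idx in (rng.choice(x.size, size=m, replace=False)
                            for _ in range(n_rep))]
        inv_n.append(1.0 / m)
        estimates.append(np.mean(vals))
    slope, intercept = np.polyfit(inv_n, estimates, 1)
    return float(intercept)

# Synthetic data: g tracks x with Gaussian noise (illustrative only).
rng = np.random.default_rng(1)
x = rng.uniform(size=4000)
g = x + 0.05 * rng.standard_normal(x.size)
info_inf = extrapolate_in_sample_size(x, g, n_bins=16)
print(f"extrapolated I = {info_inf:.2f} bits")
```

Repeating the estimate for several values of `n_bins` would show the second systematic dependence mentioned in the text.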

Analysis of multiple genes. With simultaneous mea- 
surements of expression levels for multiple genes, we can 
estimate the information that they carry jointly. The 
difficulty is that the space of expression levels is now 
much larger, but our number of samples is not. Having 
calibrated the Gaussian approximation against more direct
calculations for single genes (above), we can use this
approximation in the multiple gene case, using Eq
directly in the integrals that define $I_{\{g_i\} \to x}$. We use a
Monte Carlo method to evaluate these integrals numeri- 
cally, and estimate errors by a bootstrap method. Impor- 
tantly, if the signals that we observe are invertible linear 
combinations of the true signals — as might happen, for 
example, because of a small amount of crosstalk among 
the different imaging channels — then the invariance of 
the information to coordinate transformations tells us 
that this will not change our estimate. 
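
A Monte Carlo estimate of this kind, together with the invariance property, can be sketched as follows. The sketch assumes that the expression levels at each position are Gaussian with a position-dependent mean and covariance, and that positions are equally likely. The two synthetic "genes", the covariance and the crosstalk matrix are hypothetical, and the bootstrap error estimate is not shown.

```python
import numpy as np

def gaussian_log_density(g, mean, cov):
    """Natural log of a multivariate normal density at each row of g."""
    diff = g - mean
    sol = np.linalg.solve(cov, diff.T).T
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (np.sum(diff * sol, axis=1) + logdet
                   + mean.size * np.log(2.0 * np.pi))

def information_gaussian_mc(means, covs, z, transform=None):
    """Monte Carlo estimate (bits) of the information between {g_i} and x.

    means: (M, K) mean expression at M equally likely positions
    covs:  (M, K, K) covariance at each position
    z:     (M, n, K) standard-normal draws, reused so runs are comparable
    transform: optional invertible (K, K) matrix mixing the observed signals
    """
    M, K = means.shape
    A = np.eye(K) if transform is None else np.asarray(transform, float)
    means_t = means @ A.T                 # A mu(x)
    covs_t = A @ covs @ A.T               # A C(x) A^T
    total = 0.0
    for j in range(M):
        chol = np.linalg.cholesky(covs[j])
        g = (means[j] + z[j] @ chol.T) @ A.T      # samples from P(g | x_j), mixed
        log_all = np.stack([gaussian_log_density(g, means_t[m], covs_t[m])
                            for m in range(M)])
        log_marginal = np.logaddexp.reduce(log_all, axis=0) - np.log(M)
        total += np.mean(log_all[j] - log_marginal)
    return float(total / (M * np.log(2.0)))

# Hypothetical two-gene example with opposing sigmoidal profiles.
x = np.linspace(0.0, 1.0, 50)
means = np.column_stack([1.0 / (1.0 + np.exp(-(x - 0.3) / 0.05)),
                         1.0 / (1.0 + np.exp(-(0.7 - x) / 0.05))])
covs = np.tile(0.1 ** 2 * np.array([[1.0, 0.3], [0.3, 1.0]]), (x.size, 1, 1))
z = np.random.default_rng(0).standard_normal((x.size, 200, 2))

crosstalk = np.array([[1.0, 0.1], [0.05, 1.0]])   # invertible channel mixing
info_raw = information_gaussian_mc(means, covs, z)
info_mixed = information_gaussian_mc(means, covs, z, transform=crosstalk)
print(info_raw, info_mixed)
```

Because the same draws are reused, the two estimates agree to numerical precision, which illustrates that an invertible linear mixing of the channels leaves the information unchanged.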